Efficient instance-based learning on data streams

Authors

  • Jürgen Beringer
  • Eyke Hüllermeier
Abstract

The processing of data streams in general and the mining of such streams in particular have recently attracted considerable attention in various research fields. A key problem in stream mining is to extend existing machine learning and data mining methods so as to meet the increased requirements imposed by the data stream scenario, including the ability to analyze incoming data in an online, incremental manner, to observe tight time and memory constraints, and to respond appropriately to changes in the data characteristics and underlying distributions. This paper considers the problem of classification on data streams and develops an instance-based learning algorithm for that purpose. The experimental studies presented in the paper suggest that this algorithm has a number of desirable properties that are not, at least not as a whole, shared by currently existing alternatives. Notably, …

∗ Draft of a paper to appear in “Intelligent Data Analysis”.

Similar resources

[7] A. Asuncion and D. J. Newman. UCI Machine Learning Repository

[3] Rakesh Agrawal and Arun Swami. A one-pass space-efficient algorithm for finding quantiles. A one-pass algorithm for accurately estimating quantiles for disk-resident data.
[8] Jürgen Beringer and Eyke Hüllermeier. An efficient algorithm for instance-based learning on data streams.

Online-Data-Mining auf Datenströmen: Methoden zur Clusteranalyse und Klassifikation

• J. Beringer and E. Hüllermeier. Efficient instance based learning on data streams. Adaptive optimization of the number of clusters in fuzzy clustering. Fuzzy clustering of parallel data streams.

IRDDS: Instance reduction based on Distance-based decision surface

In instance-based learning, a training set is given to a classifier for classifying new instances. In practice, not all information in the training set is useful for classifiers. Therefore, it is convenient to discard irrelevant instances from the training set. This process is known as instance reduction, which is an important task for classifiers since through this process the time for classif...

Active Learning from Stream Data

In this paper, we propose a new research problem on active learning from data streams where data volumes grow continuously. The objective is to label a small portion of stream data from which a model is derived to predict future instances as accurately as possible. We propose a classifier-ensemble based active learning framework which selectively labels instances from data streams to build an e...

IBLStreams: a system for instance-based classification and regression on data streams

This paper presents an approach to learning on data streams called IBLStreams. More specifically, we introduce the main methodological concepts underlying this approach and discuss its implementation under the MOA software framework. IBLStreams is an instance-based algorithm that can be applied to classification and regression problems. In comparison to model-based methods for learning on data ...

Journal title:
  • Intell. Data Anal.

Volume: 11 (issue not listed)

Pages: not listed

Publication date: 2007